What is Network Flow?


•  A  log  of  all  network  activity 

•  Not  a  recording  of  all  packets 

•  A  record  of  metadata  from  related  packets 

•  Similar  to  a  phone  bill  (call  detail  record) 

•  Content  of  messages  is  not  recorded 

•  Much,  much  more  compact 

•  Can  retain  longer 

•  Less  processing 

•  Increased  privacy 
What SiLK Does


Retrospective  analysis 

•  most  useful  for  analysing  past  network  events 

•  may  feed  an  automated  report  generator 

•  good  for  forensics  (what  happened  before  the  incident?) 

Descriptive  analysis  -  profiling/categorizing 
Exploratory  analysis  -  looking  for  the  unusual 
Optimized  for  extremely  large  data  collections 

•  Very  compact  record  format 

•  Large  amount  of  history  can  stay  online. 

•  Can  be  processed  much  more  quickly  than  packets 
Modes of Inquiry

Detect | Discover
example: Snort® | example: SiLK™
Operate like Tech Support | Operate like Quality Assurance
Response | Exploring
Alert | Question
Relax between detections | Continuous security improvement

Produce new indicators
Shorten detect/response time

Snort is a registered trademark of Cisco and/or its affiliates
SiLK is a trademark of Carnegie Mellon University

Software Engineering Institute, Carnegie Mellon
Got a Question? Flow Can Help

What’s  on  my  network? 

What  happened  before  the  event? 

Where  are  policy  violations  occurring? 

What  are  the  most  popular  web  servers? 

By  how  much  would  volume  be  reduced  with  a  blacklist? 

Do  my  users  browse  to  known  infected  web  servers? 

Do  I  have  a  spammer  on  my  network? 

When  did  my  web  server  stop  responding  to  queries? 

Who  uses  my  public  servers? 

© 2014 Carnegie Mellon University
Unidirectional Flows (Uniflows)
Packet Encapsulation

[Figure: nested encapsulation of an application message]

Frame header: Dest MAC address, Source MAC addr, Type of packet
IP datagram (packet) header: Src IP address, Dst IP address, Type of segment
Transport segment header: Src port, Dest port, Flags
Payload: Application layer message (HTTP, SMTP, DNS)



Two TCP/IP Sockets Make a Connection

Client socket | Server socket
IP address: 10.0.0.1 | IP address: 203.0.113.1
L4 protocol: TCP | L4 protocol: TCP
High-numbered ephemeral port # | Low-numbered well-known port #

[Figure: Client <- Connection -> Server]
Network Flow versus NetFlow


Network  Flow — a  generic  term  for  the  summarization 
of  packets  related  to  the  same  flow  or  connection  into 
a  single  record 

NetFlow™ — A  Cisco  trademarked  set  of  format 
specifications  for  storing  network  flow  information  in  a 
digital  record 

IPFIX — a  format  specification  from  the  IETF  for  flow 
records,  an  extension  of  Cisco  NetFlow  v9 

SiLK — Another  set  of  format  specifications  for  flow 
records  and  other  related  data,  plus  the  tool  suite  to 
process  that  data 
What’s in a Record?


Fields  found  to  be  useful  in  analysis: 

•  source  address,  destination  address 

•  source  port,  destination  port  (Internet  Control  Message 
Protocol  [ICMP]  type/code) 

•  IP  [transport]  protocol 

•  bytes,  packets  in  flow 

•  accumulated  TCP  flags  (all  packets,  first  packet) 

•  start  time,  duration  (milliseconds) 

•  end  time  (derived) 

•  sensor  identity 

•  flow  termination  conditions 

•  application-layer  protocol 
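
To make the record fields concrete, here is a minimal Python sketch of how packets sharing the same addresses, ports, and protocol could be summarized into unidirectional flow records. It is an illustration only, not how SiLK or any collector is implemented; the `FlowRecord` class, the `summarize` function, and the packet dictionary keys are all hypothetical names. The two sample packets reuse the DNS query and response from these slides, with the IP-layer lengths 64 and 80 that appear in the rwcut output.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    """One unidirectional flow (hypothetical layout, not the SiLK format)."""
    sip: str
    dip: str
    sport: int
    dport: int
    protocol: int
    packets: int = 0
    nbytes: int = 0
    flags: str = ""        # union of TCP flags seen on all packets
    init_flags: str = ""   # TCP flags on the first packet only
    stime: float = 0.0     # time of first packet (seconds)
    etime: float = 0.0     # time of last packet (seconds)

    @property
    def duration(self) -> float:
        # End time is derived from the packets; duration follows from it.
        return self.etime - self.stime


def summarize(packets):
    """Group packets by 5-tuple and accumulate one record per group."""
    flows = {}
    for p in packets:
        key = (p["sip"], p["dip"], p["sport"], p["dport"], p["proto"])
        rec = flows.get(key)
        if rec is None:
            rec = FlowRecord(*key, stime=p["time"], init_flags=p["flags"])
            flows[key] = rec
        rec.packets += 1
        rec.nbytes += p["length"]
        rec.etime = p["time"]
        rec.flags = "".join(sorted(set(rec.flags) | set(p["flags"])))
    return list(flows.values())


# DNS query and response: two packets, two opposite-direction flows.
dns_packets = [
    {"time": 0.000000, "sip": "192.168.1.105", "dip": "10.1.10.1",
     "sport": 50744, "dport": 53, "proto": 17, "length": 64, "flags": ""},
    {"time": 0.348077, "sip": "10.1.10.1", "dip": "192.168.1.105",
     "sport": 53, "dport": 50744, "proto": 17, "length": 80, "flags": ""},
]
flows = summarize(dns_packets)
for rec in flows:
    print(rec.sip, rec.dip, rec.sport, rec.dport, rec.protocol,
          rec.packets, rec.nbytes)
```

Because the key includes direction (source first, destination second), the query and the response land in separate records, which is what "unidirectional" means here.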
DNS packets viewed in Wireshark

[Screenshot: Wireshark packet list and packet details for a DNS query and its response]

No. | Time | Source | Destination | Protocol | Length | Info
1 | 0.000000 | 192.168.1.105 | 10.1.10.1 | DNS | 78 | Standard query A www.mudynamics.com
2 | 0.348077 | 10.1.10.1 | 192.168.1.105 | DNS | 94 | Standard query response A 69.55.232.156

Frame 2: 94 bytes on wire (752 bits), 94 bytes captured (752 bits)
Ethernet II, Src: Cisco-Li_66:ae:1c (00:1a:70:66:ae:1c), Dst: AppleCom_d3:9a:b8 (00:19:e3:d3:…)
Internet Protocol Version 4, Src: 10.1.10.1 (10.1.10.1), Dst: 192.168.1.105 (192.168.1.105)
User Datagram Protocol, Src Port: domain (53), Dst Port: 50744 (50744)
Domain Name System (response)

Wireshark is a registered trademark of the Wireshark Foundation
Sequence Diagram

DNS Client: 192.168.1.105, UDP port 50744
DNS Server: 10.1.10.1, UDP port 53
SiLK tool (rwcut) output

          sIP|          dIP|sPort|dPort|pro|packets|bytes|sensor|type|
192.168.1.105|    10.1.10.1|50744|   53| 17|      1|   64|    S1| out|
    10.1.10.1|192.168.1.105|   53|50744| 17|      1|   80|    S1|  in|
Network Monitoring

iSiLK

iSiLK is a trademark of Carnegie Mellon University
Realistic Sequence Diagram

[Figure: sequence diagram with columns DNS Client (192.168.1.105, UDP port 50744), Local Server (10.1.10.1), Sensor]
More Realistic Sequence Diagram

[Figure: sequence diagram with columns DNS Client (192.168.1.105, UDP port 50744), Local Server (10.1.10.1), NAT, Sensor]
What is this? — 1

          sIP|          dIP|sPort|dPort|pro|packets|   flags|   initF|  type|
192.168.1.105|    10.1.10.1|50744|   53| 17|      1|        |        |   out|
    10.1.10.1|192.168.1.105|   53|50744| 17|      1|        |        |    in|
192.168.1.105| 198.51.100.6|49152|   80|  6|      4| S RPA  | S      |outweb|
 198.51.100.6|192.168.1.105|   80|49152|  6|      3| S  PA  | S   A  | inweb|
HTTP Sequence Diagram


HTTP  Client  HTTP  Server  DNS  Server 

192.168.1.105  198.51.100.6  10.1.10.1 


What Is This? — 2

          sIP|          dIP|sPort|dPort|pro|packets|bytes|   flags|
30.22.105.250| 71.55.40.253|52415|   25|  6|     22|14045|FSRPA   |
 71.55.40.253|30.22.105.250|   25|52415|  6|     19| 1283|FS PA   |
30.22.105.250| 71.55.40.253|52415|   25|  6|      1|   40|  R     |
What Is This? — 3

           sIP|            dIP|pro|packets|bytes|                  sTime|
99.217.139.155|  177.252.24.89|  1|      2|  122|2010/12/08T00:04:30.172|
99.217.139.155|177.252.149.249|  1|      2|  122|2010/12/08T00:04:37.302|
99.217.139.155|  177.252.24.52|  1|      2|  122|2010/12/08T00:04:37.312|
99.217.139.155| 177.252.24.127|  1|      2|  122|2010/12/08T00:04:58.363|
99.217.139.155| 177.252.24.196|  1|      2|  122|2010/12/08T00:05:04.327|
99.217.139.155| 177.252.149.30|  1|      2|  122|2010/12/08T00:05:09.242|
99.217.139.155|177.252.149.173|  1|      2|  122|2010/12/08T00:05:12.174|
99.217.139.155|  177.252.24.13|  1|      2|  122|2010/12/08T00:05:14.114|
99.217.139.155|  177.252.24.56|  1|      2|  122|2010/12/08T00:05:15.383|
99.217.139.155| 177.252.24.114|  1|      2|  122|2010/12/08T00:05:18.228|
99.217.139.155| 177.252.202.92|  1|      2|  122|2010/12/08T00:05:22.466|
99.217.139.155| 177.252.202.68|  1|      2|  122|2010/12/08T00:05:23.497|
99.217.139.155| 177.252.24.161|  1|      2|  122|2010/12/08T00:05:30.256|
99.217.139.155|177.252.202.238|  1|      2|  122|2010/12/08T00:05:33.088|
What Is This? — 4

         sIP|         dIP|sPort|dPort|pkts|
88.187.13.78|71.55.40.204|40936|   80|  83|
71.55.40.204|88.187.13.78|   80|40936|  84|
88.187.13.78|71.55.40.204|40938|   80| 120|
71.55.40.204|88.187.13.78|   80|40938| 123|
88.187.13.78|71.55.40.204|56172|   80|  84|
71.55.40.204|88.187.13.78|   80|56172|  83|
88.187.13.78|71.55.40.204|56177|   80| 123|
71.55.40.204|88.187.13.78|   80|56177| 124|
It’s All a Matter of Timing

The flow buffer needs to be kept manageable.

Idle timeout
■ If there is no activity within 30 seconds (configurable), flush the flow.

Active timeout
■ Flush all flows open for 30 minutes (configurable).

[Figure: timeline. Flow 1 ends before a 65 s gap; the activity that follows is split at the 30 min mark into Flow 2 and Flow 3.]
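
The two timeouts can be illustrated with a short Python sketch. This is a simplification written for this explanation, not the collector's actual algorithm: the function name `split_into_flows` is hypothetical, the input is assumed to be the sorted packet timestamps (in seconds) of a single address/port/protocol combination, and the active timeout is only checked when a packet arrives.

```python
def split_into_flows(times, idle=30.0, active=1800.0):
    """Split sorted packet timestamps into flows using two timeouts.

    idle:   a gap longer than this many seconds closes the current flow
    active: a flow open for at least this many seconds is closed
    """
    flows = []
    current = []
    for t in times:
        if current:
            idle_expired = t - current[-1] > idle
            active_expired = t - current[0] >= active
            if idle_expired or active_expired:
                flows.append(current)
                current = []
        current.append(t)
    if current:
        flows.append(current)
    return flows


# Three packets, a 65-second silence, then one packet every 20 seconds
# for a little over half an hour.
times = [0, 10, 20] + [85 + 20 * i for i in range(100)]
result = split_into_flows(times)
print([len(f) for f in result])
```

With these sample timestamps the 65-second gap ends the first flow (idle timeout), and the long-running exchange that follows is cut in two once it has been open for 30 minutes (active timeout), giving three flow records for what an application would see as two conversations.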
SiLK Types


outweb,  outicmp,  out 
SiLK Types in SiLK

Type | Description
inweb, outweb | Inbound/outbound TCP ports 80, 443, 8080
innull, outnull | Inbound/outbound filtered traffic
inicmp, outicmp | Inbound/outbound IP protocol 1
in, out | Inbound/outbound not in above categories
int2int, ext2ext | Internal to internal, external to external
other | Source not internal or external, or destination not internal, external, or null

Names in bold are default types
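
As a rough illustration of the table above, the following Python sketch assigns a type name to a flow. It is not SiLK's packing logic: the `INTERNAL` network list, the `filtered` flag standing in for "destination is null", and the order of the checks are all assumptions made for this example, and the `other` category is left out. Only 192.168.0.0/16 is treated as internal so that the results agree with the types shown in the earlier rwcut output.

```python
import ipaddress

# Assumption for this sketch: what the sensor considers "internal".
INTERNAL = [ipaddress.ip_network("192.168.0.0/16")]
WEB_PORTS = {80, 443, 8080}


def is_internal(address):
    addr = ipaddress.ip_address(address)
    return any(addr in net for net in INTERNAL)


def flow_type(sip, dip, sport, dport, proto, filtered=False):
    """Return a SiLK-style type name for one flow (simplified)."""
    src_in, dst_in = is_internal(sip), is_internal(dip)
    if src_in and dst_in:
        return "int2int"
    if not src_in and not dst_in:
        return "ext2ext"
    direction = "out" if src_in else "in"
    if filtered:                       # traffic that was blocked
        return direction + "null"
    if proto == 6 and (sport in WEB_PORTS or dport in WEB_PORTS):
        return direction + "web"
    if proto == 1:                     # ICMP
        return direction + "icmp"
    return direction


print(flow_type("192.168.1.105", "198.51.100.6", 49152, 80, 6))
print(flow_type("10.1.10.1", "192.168.1.105", 53, 50744, 17))
```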
Outline — 3
UNIX / Linux commands

System prompt
  Info + prompt character
  e.g., ~ 101>

User command
  command name   rwfilter (case sensitive)
  options        -h --help -k2 --key=2
  arguments      results.rw
  redirections   > >> <
  pipe           |

For example:

rwcut --all-fields results.rw >results.txt
rwcut --fields=1-6 results.rw | more

Linux is the registered trademark of Linus Torvalds in the U.S. and other countries
UNIX is a registered trademark of The Open Group
Some standard Linux commands

ls - list name & attributes of files and directories

cd  -  change  the  current  working  directory 

cat  -  output  the  contents  of  a  file 

more  and  less  -  display  a  file  one  page  at  a  time 

cut  -  output  only  selected  fields  of  a  file 

sort  -  reorder  the  records  (lines)  of  a  file 

wc  -  word  count  (optionally,  line  count)  of  a  file 

exit  -  logout  &  terminate  a  terminal  window 
Linux Standard symbolic files

Standard  In  (stdin)  -  where  normal  (especially 
interactive)  input  comes  from 

Standard  Out  (stdout)  -  where  normal/expected 
(especially  interactive)  output  goes  to 

Standard  Error  (stderr)  -  where  messages 
(especially  unexpected)  go  to 

Defaults: 

stdin  -  keyboard 
stdout  -  screen/window 
stderr  -  screen/window 

Defaults  are  overridden  by  redirections  and  pipes 
Shell Scripts

Put a complicated command, pipeline, or sequence of pipelines into a script file.

• It saves your commands for reuse or learning
• It eases making changes

Use the GUI editor gedit, or the simple character editors joe and nano when on an SSH connection. Use vi (vim) to earn your geek badge. Vi or vim can be found on every Linux/UNIX system.

Name your shell script something like dothis.sh

Execute (run) your script: ./dothis.sh

gedit is the registered trademark of Interactive Graphic Systems, Inc.
SSH is the registered trademark of SSH Communications Security Corp
Collection, Packing, and Analysis


Collection  of  flow  data 

•  Examines  packets  and  summarizes  into  standard  flow 
records 

•  Timeout  and  payload-size  values  are  established  during 
collection 

Packing  stores  flow  records  in  a  scheme  optimized 
for  space  and  ease  of  analysis 

Analysis  of  flow  data 

•  Investigation  of  flow  records  using  SiLK  tools 
Collection

[Figure: PCAP -> YAF -> IPFIX]

YAF parameters: Idle-timeout, Active-timeout

IPFIX record contents: Termination-attribute, Application, Start-time, Duration, Packets, Bytes, Flags...
Packing

[Figure: IPFIX and Cisco NetFlow records are packed into the SiLK repository and labeled with Sensor, Class, and Type]

[Figure: repository layout under RootDir: one directory per type (in, inweb, int2int, out, outweb, ext2ext), then year, then month]

File names: type-SENSOR_yyyymmdd.hh
e.g., in-SEN1_20091231.23
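
The naming pattern can be taken apart mechanically. The Python sketch below parses a name of the form type-SENSOR_yyyymmdd.hh; the regular expression and the function name `parse_repo_name` are written for this example and are assumptions about which characters a type or sensor name may contain, not a SiLK specification.

```python
import re
from datetime import datetime

# Assumed character sets: lowercase letters/digits for the type,
# letters/digits for the sensor name.
NAME_RE = re.compile(
    r"^(?P<type>[a-z0-9]+)-(?P<sensor>[A-Za-z0-9]+)"
    r"_(?P<date>\d{8})\.(?P<hour>\d{2})$"
)


def parse_repo_name(name):
    """Return (type, sensor, start_of_hour) for a repository file name."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError("not a type-SENSOR_yyyymmdd.hh name: " + name)
    start = datetime.strptime(m["date"] + m["hour"], "%Y%m%d%H")
    return m["type"], m["sensor"], start


print(parse_repo_name("in-SEN1_20091231.23"))
```

Each file therefore covers one type, one sensor, and one hour, which is why narrowing those three selectors reduces the number of files rwfilter has to open.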



Linux Exercise

PS1='\W \!> '
export SILK_IPV6_POLICY=asv4
cd /data/bluered
ls -l silk.conf
less silk.conf   # type "q" to exit from less
cd
Analysis

[Figure: SiLK repository -> raw (binary) flow records in a file -> SiLK tool chain]
Reporting

[Figure: text output feeds UNIX text tools (sed, awk, ...) and visualization tools (gnuplot, Rayon, Excel)]

Rayon is a trademark of Carnegie Mellon University
Excel is a registered trademark of Microsoft Corporation
So Much to Do, So Little Time...


We  can’t  discuss  all  parameters  for  every  tool. 
Resources 

•  Analyst’s  Handbook 

•  SiLK  Reference  Guide  (hard-copy  man  pages) 

•  --help  option 

•  man  command 

•  http://tools.netsa.cert.org 
What sensors are defined?

rwsiteinfo --fields=id-sensor,sensor

rwsiteinfo --fields=id-sensor,sensor,\
describe-sensor
Basic SiLK Tools: rwfileinfo

rwfileinfo displays a variety of characteristics for each file format produced by the SiLK tool suite.

It is very helpful in tracing how a file was created and where it was generated.
rwfileinfo Example

[liveuser@livecd ~]$ rwfilter --sensor=S0 --type=in,out \
    --start=2009/4/21T15 --protocol=1 \
    --pass=icmprecords.rw

[liveuser@livecd ~]$ rwfileinfo icmprecords.rw
icmprecords.rw:
  format(id)          FT_RWIPV6ROUTING(0x0C)
  version             16
  byte-order          littleEndian
  compression(id)     lzo1x(2)
  header-length       176
  record-length       88
  record-version      1
  silk-version        3.9.0
  count-records       39
  file-size           963
  command-lines
                   1  rwfilter --sensor=S0 --type=in,out --start=2009/4/21T15 --protocol=1 --pass=icmprecords.rw
rwfileinfo --fields

All fields available to display

 1  format(id)
 2  version
 3  byte-order
 4  compression(id)
 5  header-length
 6  record-length
 7  count-records
 8  file-size
 9  command-lines
10  record-version
11  silk-version
12  packed-file-info
13  probe-name
14  annotations
15  prefix-map
16  ipset
17  bag
Basic SiLK Tools: rwcut

But I can’t read binary...

rwcut provides a way to display binary records as human-readable ASCII:

• useful for printing flows to the screen
• useful for input to text-processing tools
• Usually you’ll only need the --fields option.

sip        packets    type     flags
dip        bytes      in       initialflags
sport      sensor     out      sessionflags
dport      scc        dur      application
protocol   dcc        stime    attributes
class      nhip       etime    itype & icode

Field names in italics are derived fields
rwcut Default Display

By default

• sIP, sPort
• dIP, dPort
• protocol
• packets, bytes
• flags
• sTime, eTime, duration
• sensor

--all-fields
Pretty Printing SiLK Output

Default output is fixed-width, pipe-delimited data.

           sIP|         dIP|pro|pkts|bytes|
207.240.215.71|128.3.48.203|  1|   1|   60|
207.240.215.71| 128.3.48.68|  1|   1|   60|
207.240.215.71| 128.3.48.71|  1|   1|   60|

Tools with text output have these formatting options:

• --no-titles: suppress the column headings
• --no-columns: suppress the spaces
• --column-separator: just change the bar to something else
• --delimited: combine above 3 options
• --legacy-timestamps: better for import to Excel
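
When the fixed-width, pipe-delimited text is handed to another program, it usually has to be split back into fields. A minimal Python sketch of that step follows; `parse_rwcut` is a hypothetical helper written for this example (it is not part of SiLK or PySiLK), and it assumes the first non-blank line holds the column titles.

```python
def parse_rwcut(text):
    """Turn pipe-delimited rwcut-style text into a list of dicts."""
    def split_line(line):
        line = line.strip()
        if line.endswith("|"):      # drop the single trailing delimiter
            line = line[:-1]
        return [field.strip() for field in line.split("|")]

    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = split_line(lines[0])
    return [dict(zip(header, split_line(ln))) for ln in lines[1:]]


sample = """
           sIP|         dIP|pro|pkts|bytes|
207.240.215.71|128.3.48.203|  1|   1|   60|
207.240.215.71| 128.3.48.68|  1|   1|   60|
207.240.215.71| 128.3.48.71|  1|   1|   60|
"""
rows = parse_rwcut(sample)
total_bytes = sum(int(r["bytes"]) for r in rows)
print(len(rows), total_bytes)
```

All values come back as strings with the padding removed; converting them (as with `int(r["bytes"])` here) is left to the caller.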
What do the data look like?

rwcut icmprecords.rw --fields=1-6

Try other values for --fields.
Try omitting the --fields option.
Why do we need rwcut?

cd
rwfilter --type=in \
    --start-d=2009/4/21T15 --proto=0- \
    --compress=none \
    --pass-dest=t20.rw --max-pass=20
ls -l t20.rw
rwfileinfo t20.rw
hexdump -C t20.rw   # any readable text?
rwcut --fields=1-6 t20.rw
Basic SiLK Tools: rwsort


Why  sort  flow  records? 

•  Records  are  recorded  as  received,  not  necessarily  in 
time  order. 

•  Analysis  often  requires  finding  outliers. 

•  You  can  also  sort  on  other  fields  such  as  IP  address  or 
port  to  easily  find  scanning  patterns. 

•  It  allows  analysts  to  find  behavior  such  as  beaconing  or 
the  start  of  traffic  flooding. 
rwsort Options

--fields (same as rwcut) is required.

Input files are specified as positional arguments (default is stdin).

--output-path= specifies the output file (default is stdout).

For improved sorts, specify a buffer size with --sort-buffer-size=.

For large sorts, specify a temporary directory with --temp-directory=. Temporary files are stored in /tmp by default.

rwsort t20.rw --fields=stime \
    --output-path=t20bystime.rw

rwsort t20.rw --fields=sip,sport,dport \
  | rwuniq --fields=sip,sport,dport --presorted \
    --value=dip-distinct
Basic SiLK Tools: rwfilter

[Figure: rwfilter pictured as a Swiss Army knife]

• Pick files from the repository
• Advanced flow-by-flow filtering
• Compression
• Basic statistics
• Plug in additional tools
• Direct flow output

Swiss Army knife logo is a registered trademark of Victorinox AG


rwfilter Syntax

General form

rwfilter {INPUT | SELECTION}
         PARTITION OUTPUT [OTHER]

Example call

rwfilter --sensor=S0 --type=in \
    --start-date=2009/4/21T9 \
    --end-date=2009/4/21T16 \
    --protocol=0-255 --pass=workday-21.rw
rwfilter Command Structure

The rwfilter command requires three basic parts:

• Selection criteria or input criteria (which files are input?)
  repository: class, sensor, type, start/end date/hour

• Partition (which records pass my criteria? Which fail?)

• Output (where do the passing and failing records go?)


Selection and Input Criteria

Selection options control access to repository files:

• --start-date=2009/4/21
• --end-date=2009/4/21T03
• --sensor=S0
• --class=all
• --type=in,inweb

Alternatively, use input criteria for a pipe or a file:

• myfile.rw
• stdin (useful for chaining filters through stdin/stdout)
--start-date and --end-date

Columns give the precision of --start-date; rows give the precision of --end-date.

--end-date | --start-date: Hour | --start-date: Day | --start-date: None
Hour | Hours in explicit range | Ignore end-date hour. Whole days. | Error
Day | End-hour is the same as start-hour. #hours = 1, 25, 49, ... | Whole days. | Error
None | 1 hour | 1 day | Current day to present time.
How Many Files are Selected?

#Files = Sensors x Types x Hours - missing files
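
The formula is easy to check by hand. The Python sketch below computes the upper bound (before subtracting missing files) for a selection given to the hour; `expected_files` is a hypothetical helper for this example, and it assumes both the start and end hours are included in the range.

```python
from datetime import datetime, timedelta


def expected_files(sensors, types, start, end):
    """Upper bound on files selected: sensors x types x hours."""
    hours = int((end - start) / timedelta(hours=1)) + 1   # inclusive range
    return len(sensors) * len(types) * hours


# One sensor, one type, hours 00 through 07 of 2009/4/21.
eight_hours = expected_files(["S0"], ["in"],
                             datetime(2009, 4, 21, 0),
                             datetime(2009, 4, 21, 7))

# One sensor, one type, a whole day.
whole_day = expected_files(["S0"], ["outweb"],
                           datetime(2009, 4, 21, 0),
                           datetime(2009, 4, 21, 23))
print(eight_hours, whole_day)
```

These two cases correspond to the Files column in the rwfilter volume statistics shown elsewhere in these slides (8 files for an eight-hour pull, 24 for a full day). Adding a second type or sensor doubles the count.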


Partitioning Parameters


Flow  Record  Fields 


IP  Sets 


User  pmaps  and  Country  Codes 
Tuples 
Plugins 
PySiLK 
Basic Partitioning Options


•  Simple  numeric  fields:  ports,  protocol,  ICMP  Type 

•  Specified  IP  addresses,  CIDR  blocks,  &  wildcards 

•  Sets  of  IP  addresses 

•  Combinations  of  key  fields  -  Tuples 
Simple Numeric Key Fields

--protocol=
--sport= --dport= --aport=       # source, dest, any

--protocol=6,17                  # TCP or UDP
--protocol=1-5,7-16,18-          # not TCP or UDP
--protocol=0-                    # all protocols
--dport=80,443                   # HTTP or HTTPS
--sport=6000-6063,9100-9107      # X11 or JetDirect
--aport=20,21                    # FTP
--sport=0-1023                   # Well Known Ports
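
The value syntax used above (single numbers, closed ranges, and open-ended ranges such as 18-) can be mimicked in a few lines of Python. This is a sketch of the notation only, not rwfilter's parser; `parse_ranges` and `matches` are hypothetical names, and the `maximum` argument is an assumption for what an open-ended range extends to (255 for protocols, 65535 for ports).

```python
def parse_ranges(spec, maximum):
    """Parse '6,17' or '1-5,7-16,18-' into a list of (low, high) pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            ranges.append((int(low), int(high) if high else maximum))
        else:
            ranges.append((int(part), int(part)))
    return ranges


def matches(value, spec, maximum=65535):
    """True if value falls in any range of the specification."""
    return any(low <= value <= high
               for low, high in parse_ranges(spec, maximum))


print(matches(6, "6,17", 255))             # TCP is in "TCP or UDP"
print(matches(6, "1-5,7-16,18-", 255))     # TCP is excluded here
print(matches(6010, "6000-6063,9100-9107"))
```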

ICMP Types and Codes

--icmp-type   major type of ICMP message
--icmp-code   sub-type of ICMP message

--icmp-type=0,8                 # ping reply (0) & request (8)
--icmp-type=3 --icmp-code=4     # fragmentation needed
Specified IP address, CIDR block, or wildcard

--saddress= --daddress= --any-address=
--not-saddress= --not-daddress= --not-any-address=

May specify a single:

IP address             192.0.2.1
CIDR block             192.0.2.0/24
wildcard pattern       172.16-31.x.1,254
addrs in same subnet   203.0.113.1,3,7,13,19
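
To show how a wildcard pattern is read, here is a small Python sketch that tests an IPv4 address against one. It is an illustration of the notation, not SiLK code: `wildcard_matches` and `octet_matches` are hypothetical names, only dotted-quad IPv4 is handled, and each octet of the pattern may be `x`, a number, a range, or a comma-separated list of those.

```python
def octet_matches(value, pattern):
    """Match one octet against 'x', '13', '16-31', or '1,254'."""
    if pattern == "x":
        return True
    for part in pattern.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def wildcard_matches(address, pattern):
    """True if every octet of the address matches the pattern's octet."""
    octets = [int(o) for o in address.split(".")]
    parts = pattern.split(".")
    if len(octets) != 4 or len(parts) != 4:
        return False
    return all(octet_matches(o, p) for o, p in zip(octets, parts))


print(wildcard_matches("172.20.5.254", "172.16-31.x.1,254"))
print(wildcard_matches("172.32.5.1", "172.16-31.x.1,254"))
print(wildcard_matches("203.0.113.13", "203.0.113.1,3,7,13,19"))
```

Note that the commas belong to a single octet: `203.0.113.1,3,7,13,19` names five hosts in one subnet, not five separate addresses.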
Specified IP addresses or CIDR blocks

--scidr= --dcidr= --any-cidr=
--not-scidr= --not-dcidr= --not-any-cidr=

May specify multiple:

IP addresses   192.0.2.1,198.51.100.3
CIDR blocks    192.0.2.0/24,198.51.100.0/24
mixture        192.0.2.1,192.0.2.8/29

NO wildcard patterns
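
The comma-separated list form can be reproduced with Python's standard `ipaddress` module. The sketch below is only meant to show what such a list matches; `in_cidr_list` is a hypothetical helper, not a SiLK or PySiLK function.

```python
import ipaddress


def parse_cidr_list(spec):
    """Parse '192.0.2.1,192.0.2.8/29' into a list of network objects."""
    # A bare address becomes a single-host network (/32 for IPv4).
    return [ipaddress.ip_network(item.strip()) for item in spec.split(",")]


def in_cidr_list(address, spec):
    """True if the address falls inside any listed address or block."""
    addr = ipaddress.ip_address(address)
    return any(addr in net for net in parse_cidr_list(spec))


print(in_cidr_list("192.0.2.1", "192.0.2.1,192.0.2.8/29"))
print(in_cidr_list("192.0.2.15", "192.0.2.1,192.0.2.8/29"))
print(in_cidr_list("192.0.2.16", "192.0.2.1,192.0.2.8/29"))
```

A /29 covers eight addresses, so `192.0.2.8/29` spans 192.0.2.8 through 192.0.2.15 and the third call falls just outside it.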
Sets of arbitrary addresses

--sipset= --dipset= --anyset=
--not-sipset= --not-dipset= --not-anyset=

Specifies the name of a file storing the IP set:
--sipset=internalservers.set
--dipset=RussianBizNtwk.set
--anyset=TorNodes.set
--not-dipset=whitelist.set
Combinations of key fields - Tuples

--tuple-file=TorAuthSockets.tuple --tuple-dir=reverse

TorAuthSockets.tuple file:

            sIP|sPort
  208.83.223.34|  443
  82.94.251.203|   80
 193.23.244.244|   80
194.109.206.212|   80
    86.59.21.38|   80
    128.31.0.34| 9131
   171.25.193.9|  443
    154.35.32.5|   80
212.112.245.170|   80
   76.73.17.194| 9030
rwfilter output options

--pass-destination=          # file to get records that pass
--fail-destination=          # file to get records that fail
--all-destination=           # file to get all records

--print-statistics           # report recs read/pass/fail
--print-volume-statistics    # report how many
                             # recs/pkts/bytes pass/fail
What Is This? — 5

rwfilter --sensor=S0 --type=in \
    --start=2009/4/21T00 --end=2009/4/21T07 \
    --daddress=10.1.0.0/16 --print-volume-stat

     |  Recs| Packets|   Bytes|Files|
Total|  1436|    2615|  158084|    8|
 Pass|  1436|    2615|  158084|     |
 Fail|     0|       0|       0|     |
rwfilter exercise

1) Find all traffic captured by sensor S0 going outbound to external HTTPS servers on April 21, 2009. Save these flows in file https0421.rw

2) How many flow records matched the criteria?
rwfilter exercise solution

rwfilter --sensor=S0 --type=outweb \
    --start=2009/4/21 --dport=443 \
    --pass=https0421.rw --print-volume-statistics

     |  Recs| Packets|    Bytes|Files|
Total| 43656|  173550| 36174384|   24|
 Pass|   123|    1420|   288083|     |
 Fail| 43533|  172130| 35886301|     |

rwfileinfo https0421.rw --fields=count

https0421.rw:
  count-records       123
Output Criteria

rwfilter leaves the flows in binary (compact) form.

• --pass, --fail: direct the flows to a file or a pipe
• --all: destination for everything pulled from the repository
• One output is required but more than one can be used (no screen allowed).

Other useful output

• --print-filenames, --print-missing-files
• --print-statistics or --print-volume-statistics

[Figure: records flow from the repository through rwfilter to pass and fail outputs]



Chaining Filters

It is often very efficient to chain rwfilter commands together:

• Use --pass and --fail to segregate bins.
• Use --all, so you only pull from the repository once.
What Is This? — 8

rwfilter \
    --start-date=2010/12/08 \
    --type=outweb \
    --bytes=100000- \
    --pass=stdout \
  | rwfilter \
    stdin \
    --duration=60- \
    --pass=long-http.rw \
    --fail=short-http.rw
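
The logic of the two-stage pipeline above can be mirrored in Python to make the pass/fail bookkeeping explicit. This is a sketch of the idea only: the `partition` helper and the three sample records are invented for illustration, and real flow records carry many more fields.

```python
def partition(records, predicate):
    """Split records into (passed, failed) lists, like --pass and --fail."""
    passed, failed = [], []
    for record in records:
        (passed if predicate(record) else failed).append(record)
    return passed, failed


# Invented sample records standing in for outbound web flows.
records = [
    {"bytes": 250000, "duration": 120.0},
    {"bytes": 150000, "duration": 12.5},
    {"bytes": 4000, "duration": 300.0},
]

# Stage 1: keep flows of at least 100000 bytes; the rest are discarded
# because the first rwfilter has no --fail destination.
big, _ = partition(records, lambda r: r["bytes"] >= 100000)

# Stage 2: of the large flows, separate those lasting 60 seconds or more.
long_http, short_http = partition(big, lambda r: r["duration"] >= 60)
print(len(long_http), len(short_http))
```

The small but long-lived third record ends up in neither output, because it never reaches the second stage. The "short" file therefore holds large flows that were short, not all short flows.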
Tips with rwfilter


Narrow  time,  type,  and  sensor  as  much  as  possible  (fewer 
records  to  check). 

Include  as  many  partitioning  parameters  as  possible  (easy  to 
be  vague  and  get  too  much  data). 

Can  do  multiple  queries  and  merge  results 

Can  do  further  filtering  to  narrow  results 

Iterative  exploration 
Example Typos

Typo | Problem
--port=  --destport=  --sip= or --dip= | No such keywords
--saddress=danset.set | Needs value not filename
--start-date=2006/06/12--end-date | Spaces needed
--start-date = 2006/06/12 | No spaces around equals
start-date=2006/06/12 | Need dashes
---start-date=2006/06/12 | Only two dashes
--start-date=2005/11/04:06:00:00 --end-date=2005/05/21:17:59:59 | Only down to hour
SiLK Commandments

1. Thou shalt use Sets instead of using several rwfilter commands to pull data for multiple IP addresses.

2. Thou shalt store intermediate data on local disks, not network disks.

3. Thou shalt make initial pulls from the repository, store the results in a file, and work on the file from then on. The repository is slower than processing a single file.

4. Thou shalt work in binary for as long as possible. ASCII representations are much larger and slower than the binary representations of SiLK data.

5. Thou shalt filter no more than a week of traffic at a time. The filter runs for an excessive length of time otherwise.

6. Thou shalt only run a few rwfilter commands at once.

7. Thou shalt specify the type of traffic to filter. Defaults work in mysterious ways.

8. Thou shalt appropriately label all output.

9. Thou shalt check that SiLK does not provide a feature before building your own.
Basic SiLK Counting Tools: rwcount, rwstats, rwuniq


“Count  [volume]  by  [key  field]  and  print  [summary]” 

•  basic  bandwidth  study: 

“Count  bytes  by  hour  and  print  the  results.” 

•  top  10  talkers  list: 

“Count  bytes  by  source  IP  and  print  the  10  highest  IPs.” 

•  user  profile: 

“Count  records  by  dIP-dPort  pair  and  print  all  the  pairs.” 

•  potential  scanners: 

“Count  unique  dIPs  by  sIP  and  print  the  sources  that 
contacted  more  than  100  destinations.” 
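
The "count [volume] by [key field] and print [summary]" pattern can be written out in a few lines of Python. The sketch below only illustrates the pattern; it is not how rwstats or rwuniq work internally, and the helper names and sample records are invented for this example.

```python
from collections import defaultdict


def count_by(records, key, value=None):
    """Sum `value` per distinct `key`; count records if value is None."""
    totals = defaultdict(int)
    for r in records:
        totals[r[key]] += r[value] if value else 1
    return dict(totals)


def top_n(totals, n):
    """The n bins with the largest totals, biggest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


def distinct_by(records, key, field):
    """Count distinct values of `field` per distinct `key`."""
    seen = defaultdict(set)
    for r in records:
        seen[r[key]].add(r[field])
    return {k: len(v) for k, v in seen.items()}


records = [
    {"sip": "10.0.0.1", "dip": "203.0.113.1", "bytes": 500},
    {"sip": "10.0.0.1", "dip": "203.0.113.2", "bytes": 700},
    {"sip": "10.0.0.2", "dip": "203.0.113.1", "bytes": 2000},
    {"sip": "10.0.0.1", "dip": "203.0.113.1", "bytes": 100},
]

bytes_by_sip = count_by(records, "sip", "bytes")      # top-talkers input
top_talker = top_n(bytes_by_sip, 1)
dips_per_sip = distinct_by(records, "sip", "dip")     # scanner indicator
scanners = [sip for sip, n in dips_per_sip.items() if n > 100]
print(top_talker, dips_per_sip, scanners)
```

The same three ingredients (a key that defines the bins, a measure to aggregate, and a rule for what to print) describe each of the four example questions above.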
Bins

For motor vehicle trips we could bin by:

Vehicle style - sedan, coupe, SUV, pickup, van
Highway or city trip
Personal or business trip

We could measure the trips and aggregate in bins:

total miles
fuel consumption
oil consumption
pollutant emission

[Figure: trip miles sorted into bins by vehicle style (sedan, coupe, pickup), giving total miles per bin]

http://www.prlog.org/10991533-great-value-good-looking-colour-coded-recycling-bins-exclusive-to-imrubbishcouk.html
Bins


For  flows  we  could  bin  by: 

address  or  address  block 
port 

protocol 
time  period 

We  could  measure  the  flows  and  aggregate  in  bins: 
count  of  flow  records,  packets,  bytes 
count  of  distinct  values  of  other  fields,  e.g.,  addr 
earliest  sTime,  latest  eTime 
Bins

[Figure: packet counts from flow records sorted into TCP, UDP, and ICMP bins, giving total packets per bin]

Value from flow record, e.g., packets
Bin key field, e.g., protocol
Aggregate value
Basic SiLK Counting Tools:

rwcount: count volume across time periods

rwstats: count volume across IP, port, or protocol and create descriptive statistics

rwuniq: count volume across any combination of SiLK fields

“Key field” = SiLK fields defining bins
“Volume” = {Records, Bytes, Packets} and a few others (the measure, or aggregate value)

Each tool reads raw binary flow records as input.
rwcount


■  count records, bytes, and packets by time and display results

■  fast, easy way of summarizing volumes as a time series

■  great for simple bandwidth studies

■  easy to take output and make a graph with graphing software


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Time  Bins 


When binning by time, you must specify the period of time for each bin. This is called the bin-size.

It’s the size of the bin’s opening, not the volume of the container.
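As a rough sketch of what a bin-size means in practice, the function below maps a timestamp to the start of the bin that contains it. It assumes bins are aligned to multiples of the bin-size counted from the Unix epoch; bin_start is a hypothetical helper, not a SiLK call.

```python
from datetime import datetime, timezone

def bin_start(ts, bin_size):
    """Return the start of the bin_size-second bin that contains ts."""
    epoch = int(ts.timestamp())
    # Round down to the nearest multiple of bin_size (assumed alignment).
    return datetime.fromtimestamp(epoch - epoch % bin_size, tz=timezone.utc)

flow_start = datetime(2009, 4, 21, 13, 47, 12, tzinfo=timezone.utc)
print(bin_start(flow_start, 3600))   # hourly bins
print(bin_start(flow_start, 1800))   # 30-minute bins
```

A smaller bin-size gives more, narrower bins over the same period; it does not limit how much volume a bin can hold.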


rwcount

The bin key is always time. You choose the period.

The aggregate measures are chosen for you. They are flows/records, bytes, packets.

rwfilter --sensor=S0 --start=2009/4/21 \
--type=in --proto=1 --pass=stdout \
| rwcount --bin-size=3600

               Date|   Records|     Bytes|   Packets|
...
2009/04/21T13:00:00|     10.00|   2460.00|     41.00|
2009/04/21T14:00:00|     29.00|   8036.00|    107.00|
2009/04/21T15:00:00|     22.00|   2214.00|     47.00|
2009/04/21T16:00:00|     10.00|   1586.00|     23.00|

Software  Engineering  Institute 


Carnegie  Mellon 


©  2014  Carnegie  Mellon  University 


What  Is  This?  —  9 


rwcount MSSP.rw --bin-size=3600

               Date|      Records|        Bytes|      Packets|
2010/12/08T00:00:00|   1351571.66|  73807086.40|   1606313.61|
2010/12/08T01:00:00|   1002012.43|  54451440.59|   1185143.62|
2010/12/08T02:00:00|   1402404.61|  77691865.26|   1675282.27|
2010/12/08T03:00:00|   1259973.65|  68575249.90|   1491393.08|
2010/12/08T04:00:00|    939313.56|  51410968.24|   1118584.81|
2010/12/08T05:00:00|    459564.75|  80862273.32|   1742058.62|
2010/12/08T06:00:00|   1280651.23|  69881126.41|   1519435.24|
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rwcount  Demo 


The shell can help with the arithmetic: $((24*60*60))

You also can find common periods in the Quick Reference Guide.

Time series for all outgoing traffic on S0:

rwfilter --sensor=S0 --type=out,outweb \
--start=2009/04/21 --end=2009/04/23 \
--proto=0- --pass=stdout \
| rwcount --bin-size=$((24*60*60))
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rwcount  Exercise 


Produce a time series with 30-minute intervals, analyzing incoming ICMP traffic collected at sensor S0 on April 21, 2009.
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rwcount  Exercise  solution 


rwfilter --sensor=S0 --type=in,inicmp \
--start=2009/04/21 --proto=1 \
--pass=stdout \
| rwcount --bin-size=1800

               Date|   Records|     Bytes|   Packets|
...
2009/04/21T13:00:00|      5.00|    960.00|     16.00|
2009/04/21T13:30:00|      5.00|   1500.00|     25.00|
2009/04/21T14:00:00|     22.00|   3900.00|     65.00|
2009/04/21T14:30:00|      7.00|   4136.00|     42.00|
2009/04/21T15:00:00|      6.00|    364.00|     13.00|
2009/04/21T15:30:00|     16.00|   1850.00|     34.00|
2009/04/21T16:00:00|      8.00|    934.00|     19.00|
...
rwcount --load-scheme

How are records, packets, and bytes allocated to flows that span time bins?

[Figure: a flow spanning several time bins, with its allocation to the bins shown for load schemes 0 through 6.]
rwcount --load-scheme

How do I choose a loading scheme? (hint: use default)

0 - Average load/bin (smooth peaks/valleys among bins)
1 - Flow onset / periodic behavior emphasis
2 - Emphasize flow termination
3 - Emphasize payload transfer above setup/termination
4 - Average load/time (smooth peaks/valleys over time)
5 - Worst case service loading
6 - Best case service loading

Most commonly used schemes are: 4, 0, 1
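To make "average load/time" (scheme 4) concrete, here is a minimal sketch that splits a flow's volume across bins in proportion to the time the flow spends in each one. It illustrates the idea only and is not rwcount's implementation; time_proportional is a hypothetical helper and times are plain integer seconds.

```python
def time_proportional(stime, etime, volume, bin_size):
    """Split volume across time bins in proportion to the flow's time in each.

    Returns {bin_start_seconds: share_of_volume}.
    """
    first = stime - stime % bin_size
    duration = etime - stime
    if duration == 0:
        # An instantaneous flow lands entirely in one bin.
        return {first: float(volume)}
    shares = {}
    bin_start = first
    while bin_start < etime:
        # Seconds of the flow that fall inside this bin.
        overlap = min(etime, bin_start + bin_size) - max(stime, bin_start)
        shares[bin_start] = volume * overlap / duration
        bin_start += bin_size
    return shares

# A 1200-second flow that straddles an hour boundary, carrying 1200 bytes.
print(time_proportional(3000, 4200, 1200, 3600))
# The same flow counted as one record is split the same way.
print(time_proportional(3000, 4200, 1, 3600))
```

Splitting a single record between bins this way is one reason the Records column in rwcount output can show fractional values.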
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Calling  rwstats 


rwstats --overall-stats

•  Descriptive statistics on byte and packet counts by record

•  See “man rwstats” for details.

rwstats --fields=KEY --value=VOLUME
--count=N or --threshold=N or --percentage=N
[--top or --bottom]

•  Choose one or two key fields.

•  Count one of records, bytes, or packets.

•  Great for Top-N lists and count thresholds

•  (standard output formatting options - see “man rwstats”)
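A Top-N list with percentage and cumulative-percentage columns can be sketched as follows. This is an illustration of the calculation, not rwstats itself; top_n is a hypothetical helper and the (address, bytes) pairs are made-up data using documentation addresses.

```python
from collections import Counter

def top_n(records, n):
    """records: iterable of (key, volume) pairs.

    Returns rows of (key, volume, percent, cumulative_percent),
    largest aggregate volume first.
    """
    totals = Counter()
    for key, volume in records:
        totals[key] += volume          # aggregate volume per bin
    grand_total = sum(totals.values())
    rows, cumulative = [], 0.0
    for key, volume in totals.most_common(n):
        percent = 100.0 * volume / grand_total
        cumulative += percent
        rows.append((key, volume, percent, cumulative))
    return rows

sample = [("192.0.2.1", 600), ("192.0.2.2", 300),
          ("192.0.2.1", 100), ("192.0.2.3", 1000)]
rows = top_n(sample, 2)
for row in rows:
    print("%s|%d|%.6f|%.6f|" % row)
```

The percentages are taken against the total over all bins, not just the bins shown, which is why the cumulative column need not reach 100.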
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What  Is  This? -10 


rwfilter outtraffic.rw \
--stime=2010/12/08T18:00:00-2010/12/08T18:59:59 \
--pass=stdout \
| rwstats --fields=sip --values=bytes --count=10

INPUT: 1085277 Records for 1104 Bins and 4224086177 Total Bytes
OUTPUT: Top 10 Bins by Bytes

            sIP|      Bytes|    %Bytes|   cumul_%|
    71.55.40.62| 1754767148| 41.541935| 41.541935|
   71.55.40.169| 1192063164| 28.220617| 69.762552|
   71.55.40.179|  331310772|  7.843372| 77.605923|
   71.55.40.204|  170966278|  4.047415| 81.653338|
 177.249.19.217|  122975880|  2.911301| 84.564639|
    71.55.40.72|  110726717|  2.621318| 87.185957|
   71.55.40.200|  101593627|  2.405103| 89.591060|
 177.71.129.255|   40166574|  0.950894| 90.541954|
    71.55.40.91|   35316554|  0.836076| 91.378030|
149.249.114.204|   26634602|  0.630541| 92.008571|
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rwstats  Exercise  1 


What are the top 10 incoming protocols on April 22, 2009, collected on sensor S0?
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rwstats  Exercise  1  solution 


rwfilter --sensor=S0 --type=in,inweb \
--start=2009/04/22 --prot=0- --pass=stdout \
| rwstats --fields=protocol --value=rec --count=10

INPUT: 337595 Records for 4 Bins and 337595 Total Records
OUTPUT: Top 10 Bins by Records

pro|   Records|  %Records|   cumul_%|
  6|    336037| 99.538500| 99.538500|
 17|      1467|  0.434544| 99.973045|
  1|        88|  0.026067| 99.999111|
132|         3|  0.000889|100.000000|
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rwstats  Exercise  2 


Top  10  inside  hosts  according  to  how  many  outside 
hosts  they  communicate  with. 

Use  --value=distinct:dip 
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rwstats  Exercise  2  solution 


rwfilter --sensor=S0 --type=out,outweb --proto=0- \
--start-d=2009/4/22 --pass=stdout \
| rwstats --fields=sip --value=distinct:dip --count=10

INPUT: 313028 Records for 7 Bins
OUTPUT: Top 10 Bins by dIP-Distinct

        sIP|dIP-Distin|%dIP-Disti|   cumul_%|
10.1.60.187|        50|         ?|         ?|
  10.1.60.5|        26|         ?|         ?|
 10.1.60.25|        17|         ?|         ?|
 10.1.60.73|        14|         ?|         ?|
10.1.60.191|        11|         ?|         ?|
10.1.60.251|         9|         ?|         ?|
10.1.60.132|         3|         ?|         ?|

--no-percents will clean up the question marks.
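Counting distinct values per bin, as --value=distinct:dip does, amounts to keeping a set of destinations for each source. The sketch below shows the idea on made-up (sip, dip) pairs; distinct_dips is a hypothetical helper, not a SiLK function.

```python
from collections import defaultdict

def distinct_dips(pairs):
    """pairs: iterable of (sip, dip).

    Returns [(sip, distinct_dip_count), ...], highest count first.
    """
    seen = defaultdict(set)
    for sip, dip in pairs:
        seen[sip].add(dip)             # a set ignores repeated destinations
    counts = {sip: len(dips) for sip, dips in seen.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

pairs = [
    ("10.1.60.5", "198.51.100.1"),
    ("10.1.60.5", "198.51.100.2"),
    ("10.1.60.5", "198.51.100.1"),     # repeat, not counted twice
    ("10.1.60.73", "203.0.113.9"),
]
ranking = distinct_dips(pairs)
print(ranking)
```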
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rwuniq 


Unlike rwstats, rwuniq will display all the bins, not just the top or bottom N bins.

Output is normally unsorted. --sort-output causes sorting by the key (bin), unlike rwstats, which sorts by aggregate value.
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Calling  rwuniq 


rwuniq --fields=KEY --value=VOLUME

•  Choose one or several key fields.

•  Aggregate volume count: records, bytes, or packets.

•  (standard output formatting options - see “man rwuniq”)

Apply thresholds to bins before outputting:

•  --bytes, --packets, --flows, --sip-distinct, --dip-distinct

•  Specify minimum aggregate value or a range

--sort-output by key (rwstats sorts by value)
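The rwuniq behavior described above (a multi-field key, a threshold applied before output, optional sorting by key) can be sketched like this. It is an illustration on made-up data; uniq and its parameters are hypothetical names, not SiLK's.

```python
import ipaddress
from collections import defaultdict

def uniq(records, min_bytes=0, sort_output=False):
    """records: iterable of (dip, sport, bytes); the key is (dip, sport).

    Returns [((dip, sport), {"records": n, "bytes": b}), ...] for every bin
    whose byte total is at least min_bytes.
    """
    bins = defaultdict(lambda: {"records": 0, "bytes": 0})
    for dip, sport, nbytes in records:
        totals = bins[(dip, sport)]
        totals["records"] += 1
        totals["bytes"] += nbytes
    rows = [(key, totals) for key, totals in bins.items()
            if totals["bytes"] >= min_bytes]
    if sort_output:
        # Sort by key, comparing addresses numerically rather than as text.
        rows.sort(key=lambda row: (ipaddress.ip_address(row[0][0]), row[0][1]))
    return rows

records = [
    ("198.51.100.9", 80, 500),
    ("192.0.2.7", 443, 40),
    ("198.51.100.9", 80, 700),
    ("192.0.2.7", 80, 2000),
]
rows = uniq(records, min_bytes=100, sort_output=True)
for key, totals in rows:
    print(key, totals)
```

Every bin that meets the threshold is reported, in contrast to a Top-N list that stops after N bins.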
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What  Is  This? -11 


rwfilter outtraffic.rw \
--stime=2010/12/08:18:00:00-2010/12/08
--saddress=71.55.40.62 --pass=stdout \
| rwuniq --fields=dip,sport --all-counts

            dIP|sPort|     Bytes| Packets|Records|     sTime-Earliest|       eTime-Latest|
  12.113.41.190|   80|     12782|      20|      4|2010/12/08T18:42:51|2010/12/08T18:58:49|
 30.182.228.143|   80| 203907933|  143611|      2|2010/12/08T18:53:59|2010/12/08T19:01:47|
  37.153.24.229|   80| 205628625|  144829|      2|2010/12/08T18:29:11|2010/12/08T18:42:51|
  82.180.203.87|   80| 213013145|  150896|     92|2010/12/08T18:06:36|2010/12/08T18:32:33|
 82.180.203.197|   80|       800|       8|      2|2010/12/08T18:43:30|2010/12/08T18:43:30|
 88.124.166.233|   80| 223930369|  158276|     97|2010/12/08T18:08:55|2010/12/08T18:32:25|
 88.124.166.233|  443|    509285|     732|     43|2010/12/08T18:06:57|2010/12/08T18:51:11|
 94.239.226.247|   80| 124833037|   96047|      3|2010/12/08T18:25:22|2010/12/08T19:21:34|
   109.95.61.80|   80|   8467397|    6325|     90|2010/12/08T18:08:59|2010/12/08T18:10:09|
   139.65.186.4|   80| 204123360|  143794|      3|2010/12/08T18:19:48|2010/12/08T18:26:36|
 139.177.10.136|   80| 407978375|  287354|      6|2010/12/08T18:20:03|2010/12/08T19:01:30|
 198.237.16.172|   80| 159066748|  112025|      1|2010/12/08T18:18:43|2010/12/08T18:46:55|
 219.149.72.154| 1024|        44|       1|      1|2010/12/08T18:50:40|2010/12/08T18:50:40|
 249.216.88.172|   80|        88|       2|      2|2010/12/08T18:44:42|2010/12/08T18:44:47|
 250.211.100.88|   80|   3295160|    2492|     42|2010/12/08T18:47:50|2010/12/08T18:58:53|

--sort-output: rows are ordered by key (dIP, then sPort).
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What  Is  This? -12 


rwuniq outtraffic.rw --fields=dip \
--values=sip-distinct,records,bytes --sip-distinct=400- \
--sort-output

            dIP|sIP-Distin|     Bytes|   Records|
  13.220.28.183|       512|     20480|       512|
   171.128.2.27|       448|  19069280|    476732|
  171.128.2.179|       448| 139501200|   3487530|
 171.128.212.14|       448| 139467440|   3486686|
171.128.212.124|       448| 127664480|   3191612|
171.128.212.127|       448|  66611560|   1665289|
171.128.212.188|       448| 139467680|   3486692|
171.128.212.228|       448| 139393160|   3484829|
245.225.153.120|       763|     30520|       763|
245.238.193.102|      1339|    179480|      4487|
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rwuniq  vs.  rwstats 


Both:

•  Bin by key
•  Default aggregate value is flows (records)
•  Choose which bins have aggregate values significant enough to output.
•  Show volume aggregate value[s]
•  --bin-time to adjust sTime and eTime
•  --presorted-input (omit when value includes Distinct fields, even if input is sorted)
•  --values=Records, Packets, Bytes, sIP-Distinct, dIP-Distinct, Distinct:KEY-FIELD (KEY-FIELD can't also be a key field in --fields)

rwstats in top/bottom mode:

•  --top or --bottom bins
•  Sorted by primary aggregate value
•  --count, --threshold, --percentage
•  --no-percents (good when primary aggregate isn't Bytes, Packets, or Records)

rwuniq:

•  All bins except per thresholds
•  --sort-output by key; otherwise unsorted
•  Thresholds or ranges: --bytes, --packets, --flows, --sip-distinct, --dip-distinct
•  --all-counts (bytes, pkts, flows, earliest sTime, and latest eTime)
•  --values=sTime-Earliest, eTime-Latest

Blacklists,  Whitelists,  Books  of  Lists... 


Too  many  addresses  for  the  command  line? 

•  spam  block  list 

•  malicious  websites 

•  arbitrary  list  of  any  type  of  addresses 

Create  an  IP  set! 

•  individual  IP  address  in  dotted  decimal  or  integer 

•  CIDR  blocks,  192.168.0.0/16 

•  wildcards, 10.4,6.x.2-254

Use  it  directly  within  your  filter  commands. 

•  --sipset,  --dipset,  --anyset 
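Conceptually, an IP set answers one question: is this address in the list? The sketch below mimics that with Python's standard ipaddress module. It handles single addresses and CIDR blocks only (not SiLK's wildcard notation), and build_set and in_set are hypothetical helpers rather than SiLK or PySiLK calls.

```python
import ipaddress

def build_set(lines):
    """Parse single addresses and CIDR blocks into a list of networks."""
    return [ipaddress.ip_network(line.strip(), strict=False)
            for line in lines if line.strip()]

def in_set(address, networks):
    """Return True if address falls inside any network in the list."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)

blocklist = build_set(["192.168.0.0/16", "10.1.60.25"])
print(in_set("192.168.44.7", blocklist))
print(in_set("10.1.60.26", blocklist))
```

A real IP set file is a compact binary structure, so it scales to far more addresses than a linear scan like this one.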
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Set  Tools 

rwsetbuild:  Create  sets  from  text. 

rwset:  Create  sets  from  binary  flow  records. 

rwsetcat:  Display  an  IP  set  as  text. 

rwsetmember:  Test  if  an  address  is  in  given  IP  sets. 

rwsettool:  Perform  set  algebra  (intersection,  union, 
set  difference)  on  multiple  IP  sets. 
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Set  Intersection 


[Venn diagram of two sets: Web Servers, DNS Servers]

rwsettool --intersect web.set dns.set --output web_and_dns.set
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Set  Union 


[Venn diagram of two sets: Web Servers, DNS Servers]

rwsettool --union web.set dns.set --output web_or_dns.set
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Set  Difference 


[Venn diagram of two sets: Web Servers, DNS Servers]

rwsettool --difference web.set dns.set --output web_not_dns.set
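The three set operations behave like ordinary set algebra. As a quick illustration with Python's built-in sets (the addresses are made-up documentation addresses; the variable names simply echo the file names on the slides):

```python
web = {"192.0.2.10", "192.0.2.11", "192.0.2.53"}   # hypothetical web servers
dns = {"192.0.2.53", "192.0.2.54"}                 # hypothetical DNS servers

web_and_dns = web & dns   # intersection: hosts in both sets
web_or_dns = web | dns    # union: hosts in either set
web_not_dns = web - dns   # difference: web servers that are not DNS servers

print(sorted(web_and_dns))
print(sorted(web_or_dns))
print(sorted(web_not_dns))
```

Note that difference is not symmetric: swapping the two operands gives the DNS servers that are not web servers.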
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What  Is  This?  —  6 


more MSSP.txt

171.128.2.0/24
171.128.212.0/24

rwsetbuild MSSP.txt MSSP.set

rwfilter --start=2010/12/8 --anyset=MSSP.set \
--pass=MSSP.rw --print-vol

     |        Recs|     Packets|        Bytes|Files|
Total|    30767188|    81382782|  35478407950|   48|
 Pass|    26678669|    31743084|   1464964676|     |
 Fail|     4088519|    49639698|  34013443274|     |

rwset --sip-file=MSSPsource.set MSSP.rw

rwsettool --intersect MSSP.set MSSPsource.set \
--output=activeMSSP.set

rwsetcat --count-ips activeMSSP.set

22
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What  Is  This?  —  7 


rwfilter --sensor=S0 --type=out \
--start=2009/4/21 --proto=0- \
--pass=stdout \
| rwset --dip-file=outIPs.set
rwsetcat outIPs.set --network-structure=16

10.1.0.0/16| 8748
10.2.0.0/16| 27
140.13.0.0/16| 1
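The per-block host counts shown above can be reproduced in miniature by grouping addresses under their enclosing CIDR block. This is a sketch of the idea using Python's ipaddress module on made-up addresses; hosts_per_block is a hypothetical helper, not what rwsetcat runs internally.

```python
import ipaddress
from collections import Counter

def hosts_per_block(addresses, prefix_len):
    """Count distinct addresses in each CIDR block of the given prefix length."""
    counts = Counter()
    for addr in set(addresses):        # a set holds each address once
        block = ipaddress.ip_network("%s/%d" % (addr, prefix_len), strict=False)
        counts[str(block)] += 1
    return dict(counts)

addresses = ["10.1.60.25", "10.1.60.187", "10.2.0.3", "10.1.60.25"]
summary = hosts_per_block(addresses, 16)
for block in sorted(summary):
    print("%s| %d" % (block, summary[block]))
```

Passing 24 instead of 16 groups the same addresses by /24 network.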
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Set  Exercise  1 


Make  a  set-file  of  addresses  of  all  actual  inside  hosts. 
Should  we  examine  incoming  or  outgoing  traffic? 
Make  a  set-file  of  all  outside  addresses. 

Can  you  make  both  sets  with  one  command? 
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Set  Exercise  1  solution 


rwfilter --sensor=S0 --type=out,outweb \
--start-d=2009/4/21 --end=2009/4/23 \
--proto=0- --pass=stdout \
| rwset --sip-file=insidehosts.set \
--dip-file=outsidehosts.set
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Set  Exercise  2 


Examine  the  two  set-files. 
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Set  Exercise  2  solution 


ls -l insidehosts.set
rwfileinfo insidehosts.set
rwsetcat insidehosts.set

ls -l outsidehosts.set
rwsetcat outsidehosts.set | less
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Set  Exercise  3 


Which /24 networks are on the inside?
Which /24 networks are on the outside?




Set  Exercise  3 


rwsetcat --network-struc
rwsetcat --network-struc
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Advanced  Partitioning  Options 


•  TCP  Flags 

•  Count  of  packets  and  bytes 

•  Time 

•  Extending  rwfilter’s  partitioning  options  with 
plugins 
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TCP  Flags 


S  -  Syn  (synchronize) 

U  -  Urg  (urgent) 

R  -  Rst  (reset) 

F  -  Fin  (finish) 

P  -  Psh  (push) 

A  -  Ack  (acknowledge) 

C  -  CWR  (congestion  window  reduced) 

E  -  ECE  (explicit  congestion  notification  echo) 
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TCP  Flags 

--flags-initial=  # TCP flags in 1st pkt of flow
--flags-session=  # flags in remaining packets
--flags-all=      # flags in all pkts of flow

=flagsOn/flagsExamined

flagsOn: TCP flags that must be On to pass.

flagsExamined: flags under consideration for passing.

Any flags in flagsOn must also be in flagsExamined.

Flags in flagsExamined, but not in flagsOn, must be off to pass.
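The flagsOn/flagsExamined rule is a bit-mask comparison: keep only the examined flags, then require that exactly the flagsOn flags remain. The sketch below illustrates that logic; flags_pass and to_mask are hypothetical helpers, and the bit values are the standard TCP header flag bits.

```python
# Standard TCP header flag bits, keyed by the letters used on the slides.
FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08,
             "A": 0x10, "U": 0x20, "E": 0x40, "C": 0x80}

def to_mask(letters):
    """Convert a string of flag letters such as 'SA' to a bit mask."""
    mask = 0
    for letter in letters:
        mask |= FLAG_BITS[letter]
    return mask

def flags_pass(observed, spec):
    """Apply a 'flagsOn/flagsExamined' test to the observed flag letters."""
    on, examined = spec.split("/")
    on_mask, examined_mask = to_mask(on), to_mask(examined)
    if on_mask & ~examined_mask:
        raise ValueError("flags in flagsOn must also be in flagsExamined")
    # Examined flags must match flagsOn exactly; other flags are ignored.
    return (to_mask(observed) & examined_mask) == on_mask

print(flags_pass("S", "S/SA"))     # SYN set, ACK clear
print(flags_pass("SA", "S/SA"))    # ACK is examined and must be off
print(flags_pass("SPA", "/FR"))    # neither FIN nor RST set
```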
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TCP  Flags 


--flags-initial=S/SA   # flow from client to server
--flags-initial=SA/SA  # flow from server to client
--flags-init=S/SA --flags-session=F/F   # full C->S flow
--flags-init=SA/SA --flags-session=F/F  # full S->C flow
--flags-all=S/SFR  # incomplete flow
--flags-all=/FR    # unfinished flow fragment
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Count  of  Packets  and  Bytes 

--packets=           # packets in the flow
--bytes=             # bytes in the packets in flow
--bytes-per-packet=  # average

--packets=3-
--bytes=40-570
--bytes-per-packet=40.0-75.125
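The range syntax in these examples (a low-high pair, or an open-ended "low-") can be sketched as a small parser. This is an illustration only: parse_range and in_range are hypothetical helpers, and treating a single value with no dash as an exact match is an assumption.

```python
def parse_range(text):
    """Parse 'low-high', 'low-' (no upper bound), or a single value."""
    low, dash, high = text.partition("-")
    low = float(low)
    if not dash:
        return low, low                # assumed: single value = exact match
    return low, (float(high) if high else float("inf"))

def in_range(value, text):
    """Return True if value lies within the inclusive range described by text."""
    low, high = parse_range(text)
    return low <= value <= high

# A made-up flow of 3 packets and 120 bytes.
packets, nbytes = 3, 120
print(in_range(packets, "3-"))
print(in_range(nbytes, "40-570"))
print(in_range(nbytes / packets, "40.0-75.125"))
```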
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Partitioning  by  Time 


--stime=earliertime-latertime
--etime=earliertime-latertime
--active-time=earliertime-latertime
--duration=lowseconds-highseconds

stime and etime are usually not used together.

Each time has millisecond resolution.

--stime=2009/4/21T13:00-2009/4/21T13:29  # 1/2 hr
--etime=2009/4/21T13:00:00-2009/4/21T13:00:09  # 10 sec
--stime=2009/4/21T13:00-2009/4/21T13:00:48.725  # 48.725s
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Extending  Partitioning  with  Plugins 


rwfilter’s partitioning capabilities can be extended with plugins written in Python or C.

--python-expr=  # simple python expression
--python-file=  # complex python pgm in a file
--plugin=       # compiled C program in a file

--python-expr='rec.sport == rec.dport'
--python-file=clientserver_filt.py
--plugin=app-mismatch.so
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I  Only  Believe  What  I  See 


You’ll  be  tempted  to  work  with  text-based  records. 

•  It’s  easy  to  see  the  results  and  post-process  with  other  tools 
(e.g.,  Perl,  awk,  sed,  sort). 

•  It  takes  a  lot  of  space,  and  it’s  much,  much  slower. 

Guiding  principle:  Keep  flows  in  binary  format  as  long 
as  possible. 
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What  Is  This?  — 13 


rwfilter --type=out \
--start=2010/12/08 \
--aport=22 --pass=ssh.rw

rwfilter --dport=22 ssh.rw \
--pass=stdout | rwcut

rwfilter --sport=22 ssh.rw \
--pass=stdout | rwcut
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Outline  —  4 
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PySiLK— Using  SiLK  with  Python 


•  PySiLK — an  extension  to  Python 

•  Allows  Python  to  manipulate  SiLK’s  data  files 

•  Uses  the  “silk”  python  module,  from  SEI  CERT. 
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PySiLK  components 


PySiLK 

•  Read,  manipulate,  and  write  SiLK  Flow  records, 
IPsets,  Bags,  and  Prefix  Maps  (pmaps)  from  within 
Python 

SilkPython  (--python-file=) 

•  Create  plug-ins  for  rwfilter  or  other  SiLK  utilities. 

•  Create  partitioning  switches  for  rwfilter 

•  Create  new  flow-record  fields  for  other  utilities 

--python-expr= 

•  Create  a  simple  partitioning  test  without  creating  a 
new  switch 
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Stand-alone  PySiLK  example 


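#!/usr/bin/env python
import silk

# Open the flow file for reading and walk through its records.
myfile = silk.silkfile_open("MyFlows.rw", silk.READ)
for rec in myfile:
    # Report flows with matching source and destination ports below 2500.
    if rec.sport < 2500 and rec.sport == rec.dport:
        print("%d %s %s %s" % (rec.sport, rec.stime, rec.sip, rec.dip))
myfile.close()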
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PySiLK  exercise 


Write  a  Python  program  which  reports  the  source  IP 
address  associated  with  the  lowest  source  port  used 
by  any  flow  record  in  the  file  MyFlows.rw. 
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PySiLK  exercise  solution 


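#!/usr/bin/env python
import silk

lowsport = 65536  # higher than any valid port; could use 99999
lowsip = None     # stays None if the file holds no records

myfile = silk.silkfile_open("MyFlows.rw", silk.READ)
for rec in myfile:
    # Remember the lowest source port seen so far and the address that used it.
    if rec.sport < lowsport:
        lowsport = rec.sport
        lowsip = rec.sip
myfile.close()
print("%d %s" % (lowsport, lowsip))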
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--python-expr  example 


rwfilter sample.rw \
--protocol=6 \
--python-expr='rec.sport == rec.dport' \
--pass=equalTCPports.rw
SilkPython example (1)

import silk

def lowerport(rec):
    # Return whichever of the two ports is numerically lower.
    if rec.sport < rec.dport:
        return rec.sport
    else:
        return rec.dport

register_int_field("lport", lowerport,
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SilkPython  example  (2) 


rwstats --python-file=lowport.py --fields=lport \
--value=records --count=10 flows.rw
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SilkPython  exercise 


Write  a  plug-in  for  rwcut,  rwstats,  etc.  The  plug-in 
should  define  a  new  flow-record  field  which  contains 
the  IP  address  of  the  host  using  the  lower  port 
number  in  the  flow.  You’ll  need  the  following 
SilkPython  function: 

register_ip_field(field_name,  ip_function) 
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SilkPython  exercise  solution 

import silk

def lowerport_ip(rec):
    # Return the address of the host using the lower port number.
    if rec.sport < rec.dport:
        return rec.sip
    else:
        return rec.dip

register_ip_field("lip", lowerport_ip)
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Alternatives  to  PySiLK 


•  SiLK tools
   •  Not as flexible criteria as Python.
   •  Could use tuple files
      •  Must be maintained
      •  Aren’t self-contained with logic
      •  Large tuple files run slower than Python.

•  Text processing with Perl, C, or Java
   •  Create text with rwcut delimited without titles
   •  Convert ports back to integers
   •  Dealing with dates, times, or addresses difficult
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Modified  example  of  PySilk 


•  Summarize  the  selection  as  a  count  by  port 

•  Just  keep  a  Python  dictionary 

•  Key  =  port  number 

•  Value  =  count 
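A minimal sketch of that modification follows. To keep it runnable without PySiLK, plain (sport, dport) tuples stand in for flow records; with PySiLK the loop would instead read rec.sport and rec.dport from the records of an opened flow file. count_by_port is a hypothetical helper name.

```python
def count_by_port(records):
    """Count selected flows by port.

    records: iterable of (sport, dport) tuples standing in for flow records.
    Selection matches the stand-alone example: equal ports below 2500.
    """
    counts = {}                        # key = port number, value = count
    for sport, dport in records:
        if sport < 2500 and sport == dport:
            counts[sport] = counts.get(sport, 0) + 1
    return counts

sample = [(123, 123), (53, 53), (123, 123), (80, 51234), (5000, 5000)]
counts = count_by_port(sample)
for port in sorted(counts):
    print(port, counts[port])
```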
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PySiLK  advantages 


•  Speeds  both  programming  and  processing 

•  Keeps  data  in  binary,  unlike  Perl  &  C 
•  No  parsing  text 

•  Built-in  conversions  of  objects  to  strings 

•  Full  power  of  Python 

•  Good  for: 

•  Stateful  filters  and  output  options 

•  Integrate  SiLK  with  other  data  types 

•  Complex  or  branching  filter  rules 

•  Custom  key  fields  and  aggregators  for  rwcut,  rwsort 
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Furthering  Your  SiLK  Analysis 
Skills (1)


Each tool has a --help option.
SiLK  Reference  Guide 
SiLK  Analysts’  Handbook 

•  Both  available  at  the  SiLK  tools  website 
http://tools.netsa.cert.org 

Email  support 

•  silk-help@cert.org 
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Furthering  Your  SiLK  Analysis 
Skills  (2) 


Tool  tips 

•  SiLK  Tooltips  link  on  http://tools.netsa.cert.org 

Flow  analysis  research  and  advanced  techniques 

•  http://www.cert.org/flocon 

•  http://www.cert.org/netsa 
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Questions? 
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Contact  Information 


Ron  Bandes  —  rbandes@cert.org 
Software  Engineering  Institute 
Carnegie  Mellon  University 
Pittsburgh,  PA 




